Mental healthcare is marked by significant stigma and inequality, especially in under-served populations, and both carry over into the data that are collected. When not properly accounted for, machine learning (ML) models learned from such data can reinforce the structural biases already present in society. Here, we present a systematic study of bias in ML models designed to predict depression in four case studies covering different countries and populations. We find that standard ML approaches regularly exhibit biased behavior. However, we show that standard mitigation techniques, as well as our own post-hoc method, can be effective in reducing the level of unfair bias. We provide practical recommendations for developing ML models for depression risk prediction with increased fairness and trust in the real world. No single best ML model for depression prediction provides equality of outcomes. This emphasizes the importance of analyzing fairness during model selection and of transparently reporting the impact of debiasing interventions.
In recent years, topic modeling has emerged as a powerful technique for organizing and summarizing large collections of documents and for searching for particular patterns in them. However, privacy concerns arise when data from different sources must be cross-analyzed. Federated topic modeling addresses this issue by allowing multiple parties to jointly train a topic model without sharing their data. While several federated approximations of classical topic models exist, no research has been carried out on their application to neural topic models. To fill this gap, we propose and analyze a federated implementation based on state-of-the-art neural topic modeling implementations, showing its benefits when topics are diverse across the nodes' documents and a joint model is needed. By construction, our approach is equivalent to a centralized approach both theoretically and in practice, while preserving the privacy of the nodes.
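Audio and visual data analysis tasks usually have to deal with high-dimensional, nonnegative signals. However, most data analysis methods suffer from overfitting and numerical problems when the data are high-dimensional, so a dimensionality-reduction preprocessing step is needed. Moreover, interpretability of how and why filters work in audio or visual applications is a desired property, especially when energy or spectral signals are involved. In these cases, owing to the nature of the signals, nonnegativity of the filter weights is desirable for better understanding how the filters work. Motivated by these two requirements, we propose different methods for reducing the dimensionality of the data while guaranteeing the nonnegativity and interpretability of the solution. In particular, we propose a generalized methodology for designing filter banks in a supervised way for applications that deal with nonnegative data, and we explore different ways of solving the proposed objective function, including a nonnegative version of partial least squares. We analyze the discriminative power of the features obtained with the proposed methods in two different and widely studied applications: texture classification and music genre classification. Furthermore, we compare the filter banks obtained by our methods with those of other state-of-the-art methods specifically designed for feature extraction.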
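Multivariate analysis (MVA) comprises a family of well-known methods for feature extraction that exploit correlations among the input variables representing the data. One important property enjoyed by most such methods is uncorrelation among the extracted features. Recently, regularized versions of MVA methods have appeared in the literature, mainly with the goal of gaining interpretability of the solution. In these cases, the solution can no longer be obtained in closed form, and more complex optimization methods that rely on iterating two steps are frequently used. This paper turns to an alternative approach for solving this iterative problem. The main novelty of the approach lies in preserving several properties of the original methods, most notably the uncorrelation of the extracted features. Under this framework, we propose a novel method that takes advantage of the L-21 norm to perform variable selection during the feature extraction process. Experimental results on different problems corroborate the advantages of the proposed formulation over existing formulations.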
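Adaptive filters are at the core of many signal processing applications, ranging from acoustic noise suppression to echo cancellation, array beamforming, and channel equalization, and on to more recent sensor network applications in surveillance, target localization, and tracking. A trending approach in this direction is to resort to in-network distributed processing, in which individual nodes implement adaptation rules and diffuse their estimates to the network. When a priori knowledge about the filtering scenario is limited or imprecise, selecting the most appropriate filter structure and adjusting its parameters becomes a challenging task, and erroneous choices can lead to inadequate performance. To address this difficulty, one useful approach is to rely on combinations of adaptive structures. The combination of adaptive filters exploits, to some extent, the same divide-and-conquer principle that has been successfully exploited by the machine learning community (e.g., in bagging or boosting). In particular, the problem of combining the outputs of several learning algorithms (mixture of experts) has been studied in the computational learning field from a different perspective: rather than studying the expected performance of the mixture, deterministic bounds are derived that apply to individual sequences and therefore reflect worst-case scenarios. These bounds require assumptions different from those typically used in adaptive filtering, which is the emphasis of this overview article. We review the key ideas and principles behind these combination schemes, with emphasis on design rules. We also illustrate their performance with a variety of examples.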